
3 research outputs found

THORN: Temporal Human-Object Relation Network for Action Recognition

Author: Bremond Francois
Dai Rui
Guermal Mohammed
Publication date: 20/04/2022

Most action recognition models treat human activities as unitary events. However, human activities often follow a certain hierarchy. In fact, many human activities are compositional. Also, these actions are mostly human-object interactions. In this paper we propose to recognize human action by leveraging the set of interactions that define an action. In this work, we present THORN, an end-to-end network that can leverage important human-object and object-object interactions to predict actions. This model is built on top of a 3D backbone network. The key components of our model are: 1) an object representation filter for modeling objects; 2) an object relation reasoning module to capture object relations; 3) a classification layer to predict the action labels. To show the robustness of THORN, we evaluate it on EPIC-Kitchen55 and EGTEA Gaze+, two of the largest and most challenging first-person and human-object interaction datasets. THORN achieves state-of-the-art performance on both datasets.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

MultiMediate'23: Engagement Estimation and Bodily Behaviour Recognition in Social Interactions

Author: Alexandersson Jan
André Elisabeth
Balazia Michal
Baur Tobias
Brémond François
Bulling Andreas
Dietz Michael
Guermal Mohammed
Heimerl Alexander
Müller Philipp
Schiller Dominik
Thomas Dominike
Publication date: 16/08/2023

Automatic analysis of human behaviour is a fundamental prerequisite for the creation of machines that can effectively interact with and support humans in social interactions. In MultiMediate'23, we address two key human social behaviour analysis tasks for the first time in a controlled challenge: engagement estimation and bodily behaviour recognition in social interactions. This paper describes the MultiMediate'23 challenge and presents novel sets of annotations for both tasks. For engagement estimation we collected novel annotations on the NOvice eXpert Interaction (NOXI) database. For bodily behaviour recognition, we annotated test recordings of the MPIIGroupInteraction corpus with the BBSI annotation scheme. In addition, we present baseline results for both challenge tasks. Comment: ACM MultiMedia'2

arXiv.org e-Print Archive

THORN: Temporal Human-Object Relation Network for Action Recognition

Author: Bremond Francois
Dai Rui
Guermal Mohammed
Publication venue: HAL CCSD
Publication date: 22/08/2022

Most action recognition models treat human activities as unitary events. However, human activities often follow a certain hierarchy. In fact, many human activities are compositional. Also, these actions are mostly human-object interactions. In this paper we propose to recognize human action by leveraging the set of interactions that define an action. In this work, we present THORN, an end-to-end network that can leverage important human-object and object-object interactions to predict actions. This model is built on top of a 3D backbone network. The key components of our model are: 1) an object representation filter for modeling objects; 2) an object relation reasoning module to capture object relations; 3) a classification layer to predict the action labels. To show the robustness of THORN, we evaluate it on EPIC-Kitchen55 and EGTEA Gaze+, two of the largest and most challenging first-person and human-object interaction datasets. THORN achieves state-of-the-art performance on both datasets.

INRIA a CCSD electronic archive server